The discrepancy between mean reward and mean reinforcement

Similar resources

Optimized Maximum Mean Discrepancy

We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a ...

Maximum Mean Discrepancy Imitation Learning

Imitation learning is an efficient method for many robots to acquire complex skills. Some recent approaches to imitation learning provide strong theoretical performance guarantees. However, there remain crucial practical issues, especially during the training phase, where the training strategy may require execution of control policies that are possibly harmful to the robot or its environment. M...

Testing Hypotheses by Regularized Maximum Mean Discrepancy

Do two data samples come from different distributions? Recent studies of this fundamental problem focused on embedding probability distributions into sufficiently rich characteristic Reproducing Kernel Hilbert Spaces (RKHSs), to compare distributions by the distance between their embeddings. We show that Regularized Maximum Mean Discrepancy (RMMD), our novel measure for kernel-based hypothesis ...

Comparison between the Mean Variance optimal and the Mean

We compare optimal liquidation policies in continuous time in the presence of trading impact using numerical solutions of Hamilton-Jacobi-Bellman (HJB) partial differential equations (PDE). In particular, we compare the time-consistent mean-quadratic-variation strategy with the time-inconsistent (pre-commitment) mean-variance strategy. We show that the two different risk measures lead t...

Image Analysis Applications of the Maximum Mean Discrepancy Distance Measure

The need to quantify distance between two groups of objects is prevalent throughout the signal processing world. The difference of group means computed using the Euclidean, or ℓ2, distance is one of the predominant distance measures used to compare feature vectors and groups of vectors, but many problems arise with it when high data dimensionality is present. Maximum mean discrepancy (MMD) is a...

Journal

Journal title: Psychonomic Science

Year: 1965

ISSN: 0033-3131,2197-9952

DOI: 10.3758/bf03343273